48 research outputs found

    Locating bugs without looking back

    Get PDF
    Bug localisation is a core program comprehension task in software maintenance: given the observation of a bug, e.g. via a bug report, where is it located in the source code? Information retrieval (IR) approaches see the bug report as the query, and the source code files as the documents to be retrieved, ranked by relevance. Such approaches have the advantage of not requiring expensive static or dynamic analysis of the code. However, current state-of-the-art IR approaches rely on project history, in particular previously fixed bugs or previous versions of the source code. We present a novel approach that directly scores each current file against the given report, thus not requiring past code and reports. The scoring method is based on heuristics identified through manual inspection of a small sample of bug reports. We compare our approach to eight others, using their own five metrics on their own six open source projects. Out of 30 performance indicators, we improve 27 and equal 2. Over the projects analysed, on average we find one or more affected files in the top 10 ranked files for 76% of the bug reports. These results show the applicability of our approach to software projects without history

    Mining Version Histories for Detecting Code Smells

    Get PDF
    Code smells are symptoms of poor design and implementation choices that may hinder code comprehension, and possibly increase change- and fault-proneness. While most of the detection techniques just rely on structural information, many code smells are intrinsically characterized by how code elements change over time. In this paper, we propose HIST (Historical Information for Smell deTection), an approach exploiting change history information to detect instances of five different code smells, namely Divergent Change, Shotgun Surgery, Parallel Inheritance, Blob, and Feature Envy.We evaluate HIST in two empirical studies. The first, conducted on twenty open source projects, aimed at assessing the accuracy of HIST in detecting instances of the code smells mentioned above. The results indicate that the precision of HIST ranges between 72% and 86%, and its recall ranges between 58% and 100%. Also, results of the first study indicate that HIST is able to identify code smells that cannot be identified by competitive approaches solely based on code analysis of a single system’s snapshot. Then, we conducted a second study aimed at investigating to what extent the code smells detected by HIST (and by competitive code analysis techniques) reflect developers’ perception of poor design and implementation choices. We involved twelve developers of four open source projects that recognized more than 75% of the code smell instances identified by HIST as actual design/implementation problems

    Conclave: ontology-driven measurement of semantic relatedness between source code elements and problem domain concepts

    Get PDF
    Software maintainers are often challenged with source code changes to improve software systems, or eliminate defects, in unfamiliar programs. To undertake these tasks a sufficient understanding of the system (or at least a small part of it) is required. One of the most time consuming tasks of this process is locating which parts of the code are responsible for some key functionality or feature. Feature (or concept) location techniques address this problem. This paper introduces Conclave, an environment for software analysis, and in particular the Conclave-Mapper tool that provides a feature location facility. This tool explores natural language terms used in programs (e.g. function and variable names), and using textual analysis and a collection of Natural Language Processing techniques, computes synonymous sets of terms. These sets are used to score relatedness between program elements, and search queries or problem domain concepts, producing sorted ranks of program elements that address the search criteria, or concepts. An empirical study is also discussed to evaluate the underlying feature location technique.info:eu-repo/semantics/publishedVersio

    Using the Conceptual Cohesion of Classes for Fault Prediction in Object-Oriented Systems

    No full text

    A Multi-Study Investigation into Dead Code

    No full text
    Dead code is a bad smell and it appears to be widespread in open-source and commercial software systems. Surprisingly, dead code has received very little empirical attention from the software engineering research community. In this paper, we present a multi-study investigation with an overarching goal to study, from the perspective of researchers and developers, when and why developers introduce dead code, how they perceive and cope with it, and whether dead code is harmful. To this end, we conducted semi-structured interviews with software professionals and four experiments at the University of Basilicata and the College of William Mary. The results suggest that it is worth studying dead code not only in the maintenance and evolution phases, where our results suggest that dead code is harmful, but also in the design and implementation phases. Our results motivate future work to develop techniques for detecting and removing dead code and suggest that developers should avoid this smell

    Are unreachable methods harmful? Results from a controlled experiment

    No full text
    In this paper, we present the results of a controlled experiment conducted to assess whether the presence of unreachable methods in source code affects source code comprehensibility and modifiability. A total of 47 undergraduate students at the University of Basilicata participated in this experiment. We divided the participants in two groups. The participants in the first group were asked to comprehend code base containing unreachable methods and implement five change requests in that code base. The participants in the second group were asked to accomplish exactly the same tasks as the participants in the first group, however, the source code provided to them did not contain any unreachable methods. The results of the study indicate that code comprehensibility is significantly higher when source code does not contain unreachable methods. However, we did not observe a statistically significant difference for code modifiability. From these results, we distill lessons and implications for practitioners as well as possible avenues for further research

    Combining Probabilistic Ranking and Latent Semantic Indexing for Feature Identification

    No full text
    corecore